Convergence of Deep Neural Networks to a Hierarchical Covariance Matrix Decomposition

Authors

  • Nima Dehmamy
  • Neda Rohani
  • Aggelos K. Katsaggelos
Abstract

We show that in a deep neural network trained with ReLU, the low-lying layers should be replaceable with truncated linearly activated layers. We derive the gradient descent equations in this truncated linear model and demonstrate that, if the distribution of the training data is stationary during training, the optimal choice for weights in these low-lying layers is the eigenvectors of the covariance matrix of the data. If the training data is random and uniform enough, these eigenvectors can be found using a small fraction of the training data, thus reducing the computational complexity of training. We show how this can be done recursively to form successive, trained layers. At least in the first layer, our tests show that this approach improves classification of images while reducing network size.
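To make the recipe in the abstract concrete, here is a minimal NumPy sketch of the idea: estimate the covariance matrix from a small random subsample of the (flattened) training data, take its leading eigenvectors as the weights of a low-lying layer, and repeat recursively on the layer's output. The function and parameter names (covariance_eigen_layer, build_layers, n_units, sample_frac) are hypothetical; this is an illustration of the stated idea, not the authors' implementation.

    import numpy as np

    def covariance_eigen_layer(X, n_units, sample_frac=0.1, seed=0):
        """Weights for one low-lying layer: the top eigenvectors of the data
        covariance, estimated from a small random subsample of X.
        Illustrative sketch only; n_units and sample_frac are assumed names."""
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(X), size=max(2, int(sample_frac * len(X))), replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)            # centered subsample
        cov = Xs.T @ Xs / (len(idx) - 1)             # empirical covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
        top = np.argsort(eigvals)[::-1][:n_units]    # keep the leading directions
        return eigvecs[:, top]                       # shape (n_features, n_units)

    def build_layers(X, layer_sizes, sample_frac=0.1):
        """Recursively stack covariance-eigenvector layers: each layer's
        truncated-linear output becomes the input of the next layer."""
        layers, H = [], X
        for n_units in layer_sizes:
            W = covariance_eigen_layer(H, n_units, sample_frac)
            layers.append(W)
            H = np.maximum(H @ W, 0.0)               # truncated linear (ReLU-like) map
        return layers

    # Example: pretrain two layers from flattened images; the projected
    # features can then be fed to any downstream classifier.
    X = np.random.rand(1000, 784).astype(np.float32)  # stand-in for image data
    layers = build_layers(X, layer_sizes=[128, 64], sample_frac=0.05)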


Similar articles

Cystoscopy Image Classification Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attracted the attention of many researchers. However, no smart activity has been provided in the field of medical image processing for the diagnosis of bladder cancer through cystoscopy images, despite its high prevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...


Robust stability of fuzzy Markov type Cohen-Grossberg neural networks by delay decomposition approach

In this paper, we investigate the delay-dependent robust stability of fuzzy Cohen-Grossberg neural networks with Markovian jumping parameters and mixed time-varying delays by a delay decomposition method. A new Lyapunov-Krasovskii functional (LKF) is constructed by nonuniformly dividing the discrete delay interval into multiple subintervals and choosing proper functionals with different weighting matr...


Simultaneous Monitoring of Multivariate-Attribute Process Mean and Variability Using Artificial Neural Networks

In some statistical process control applications, the quality of the product is characterized by the combination of both correlated variable and attribute quality characteristics. In this paper, we propose a novel control scheme based on the combination of two multi-layer perceptron neural networks for simultaneous monitoring of the mean vector as well as the covariance matrix in multivariate-attribu...


Deep-LMS for gigabit transmission over unshielded twisted pair cables

In this paper we propose a rapidly converging LMS algorithm for crosstalk cancellation. The architecture is similar to deep neural networks, where multiple layers are adapted sequentially. The application motivating this approach is gigabit rate transmission over unshielded twisted pairs using a vectored system. The crosstalk cancellation algorithm uses an adaptive non-diagonal preprocessing ma...
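As a generic reminder of the LMS adaptation underlying this kind of crosstalk canceller (a textbook single-channel LMS filter, not the Deep-LMS architecture or its non-diagonal preprocessing), one update loop might look like the sketch below; the step size mu and tap count n_taps are assumptions.

    import numpy as np

    def lms_cancel(reference, received, n_taps=8, mu=0.01):
        """Textbook LMS canceller: estimate the interference leaking from
        `reference` into `received` and subtract it sample by sample."""
        reference = np.asarray(reference, dtype=float)
        received = np.asarray(received, dtype=float)
        w = np.zeros(n_taps)                         # adaptive filter taps
        cleaned = np.zeros(len(received))
        for k in range(n_taps, len(received)):
            x = reference[k - n_taps:k][::-1]        # most recent reference samples
            e = received[k] - w @ x                  # error = interference-free estimate
            w += mu * e * x                          # LMS weight update
            cleaned[k] = e
        return cleaned, w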


Learning Deep Architectures via Generalized Whitened Neural Networks

Whitened Neural Network (WNN) is a recent advanced deep architecture, which improves convergence and generalization of canonical neural networks by whitening their internal hidden representation. However, the whitening transformation increases computation time. Unlike WNN that reduced runtime by performing whitening every thousand iterations, which degenerates convergence due to the ill conditi...
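For reference, a plain ZCA whitening of a batch of hidden activations can be written as below. This is only the standard transformation the blurb alludes to, not the generalized WNN procedure; the ridge term eps is an assumption of the sketch.

    import numpy as np

    def zca_whiten(H, eps=1e-5):
        """Decorrelate a batch of hidden activations H (n_samples, n_features)
        and scale each direction to unit variance (standard ZCA whitening)."""
        Hc = H - H.mean(axis=0)                      # center the activations
        cov = Hc.T @ Hc / (H.shape[0] - 1)           # feature covariance
        eigvals, eigvecs = np.linalg.eigh(cov)
        W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
        return Hc @ W                                # whitened representation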


Journal:
  • CoRR

Volume: abs/1703.04757

Publication year: 2017